Introduction#

The Internet is the global network of computers that provides information and communication facilities using standardized protocols. The World Wide Web (WWW) refers to the interlinked resources stored across these interconnected machines. Broadly, these machines are classified into clients and servers: the client requests a resource from the server, and the server returns the requested resource to the client. This communication between the client and server is carried out with the help of an API. Before diving into APIs, let's understand the basic concepts of the World Wide Web and the Hypertext Transfer Protocol (HTTP). These concepts paved the way for REST APIs, which we’ll discuss in the coming chapters.

There are many technologies that one might use for client-server communication, but we’ll focus on the HTTP, RPC, and WebSocket protocols in this chapter. We’ll start with the WWW because it is the primary use case for HTTP. The other two (RPC and WebSockets) are used extensively in other use cases. We believe these three collectively cover the major concepts of client-server communication abstractions.

The World Wide Web#

The World Wide Web (WWW) is a hypertext-based information system that interlinks documents stored across millions of machines all over the Internet. It’s commonly known as the web or w3.

The World Wide Web began at CERN, the European Organization for Nuclear Research, in 1989. The initial idea was to support large teams working across a dozen or more countries and time zones using a constantly changing collection of reports, pictures, drawings, blueprints, and other materials produced by particle physics research. Tim Berners-Lee, then working at CERN, proposed the idea of a web of connected documents. The first (text-based) prototype was implemented 18 months later. A public demonstration at the Hypertext '91 conference drew the attention of other academics and inspired Marc Andreessen of the University of Illinois to create Mosaic, the first widely popular graphical browser, released in February 1993. This was just the beginning of the World Wide Web, which in just a few decades has transformed millions of lives and revolutionized the world of information and networking.

Point to Ponder

Question

What is the difference between the Internet and the web?

The Internet is a network of networks with a series of interconnected nodes that can communicate with each other. In contrast, the web is one of the applications built on the Internet.

The basic terminologies of the Web#

Broadly, the web combines the following four basic ideas:

  • Client-server model: The client-server model is the principal concept behind the web. The client and server are two different entities connected via the Internet. The client requests a resource, and the server provides the requested resource to the client. One example of a client-server pair is a web browser (client) and a Domain Name System (DNS) server. When a user types a URL in the browser (for example, www.educative.io), the DNS server returns the web server's IP address. In this case, it’s 104.18.3.119.

  • Resource identifiers: A resource identifier is a unique identifier used to locate a resource, such as a file or document, over the Internet. For example, https://www.educative.io/courses/grokking-modern-system-design-software-engineers-managers/g7EwEyNnR6D takes us to a lesson (resource) in Educative’s course on system design.

  • Hypertext and hypermedia: Hypertext is text that is not linear, as in printed books or pages, but instead contains links to other documents, so a reader can move freely from one document to another. Hypermedia extends hypertext: along with the text, the documents contain multimedia content such as graphics, animations, images, videos, audio, and interactive elements (such as embedded programs). For example, a website may show an image that, when clicked, takes the user to another web page.

  • Markup language: A markup language is a set of symbols inserted in a text document to describe its structure, formatting, or the connections between its parts. The following widget shows an example of the HTML markup language.

An example of the HTML markup language

Note: Browsers render web pages by parsing markup languages such as HTML.
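To make the idea of parsing markup concrete, here is a minimal sketch that uses Python's standard `html.parser` module to walk a small HTML document and collect its title and hyperlinks. The sample page, the `/about` link, and the `LinkCollector` class are illustrative assumptions and are not taken from the lesson's widget. A real browser does far more (building a full document tree, applying styles, running scripts).

```python
from html.parser import HTMLParser

# A tiny, made-up HTML document: tags describe structure, <a> tags carry hyperlinks.
SAMPLE_PAGE = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1>Hello, web</h1>
    <p>Read the <a href="https://www.educative.io">course catalog</a>
       or the <a href="/about">about page</a>.</p>
  </body>
</html>
"""


class LinkCollector(HTMLParser):
    """Collects the page title and the href of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text between <title> and </title> arrives here.
        if self._in_title:
            self.title += data


collector = LinkCollector()
collector.feed(SAMPLE_PAGE)
print(collector.title)
print(collector.links)
```

The collected links are exactly the hypertext part of the page: each one is a resource identifier the reader could follow next.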

Web protocols#

Since the web follows the client-server model, it needs well-defined protocols to make communication between clients and servers interoperable. On the web, the client requests data, and the server provides it in a proper format. The client application, usually a browser, displays the data received from the server in a specific format: a web page. This information flow between clients and servers is governed by a protocol called Hypertext Transfer Protocol (HTTP).

Application layer protocols such as the File Transfer Protocol (FTP), Simple Mail Transfer Protocol (SMTP), HTTP, and so on, use lower layer protocols like Transmission Control Protocol (TCP), User Datagram Protocol (UDP), and Internet Protocol (IP) to provide services to end users. This course discusses all the protocols that are deemed essential.
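As a rough illustration of what an application layer protocol looks like on the wire, the sketch below shows an HTTP/1.1 request and response as plain bytes and splits the response into its parts. The sample messages, the `/index.html` path, and the `parse_response` helper are assumptions made for illustration. Real clients rely on a full HTTP library rather than hand-written parsing.

```python
# What the client hands to the transport layer (TCP) when it asks for a page.
SAMPLE_REQUEST = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: www.educative.io\r\n"
    b"Connection: close\r\n"
    b"\r\n"
)

# What a server might send back: status line, headers, blank line, body.
SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)


def parse_response(raw: bytes):
    """Split a raw HTTP/1.x response into (status_code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: "<version> <status code> <optional reason phrase>"
    parts = lines[0].split(" ", 2)
    status_code = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, reason, headers, body


status, reason, headers, body = parse_response(SAMPLE_RESPONSE)
print(status, reason)
print(headers)
print(body)
```

HTTP only defines the shape of these messages. Delivering the bytes reliably and in order is left to TCP, and routing them between machines is left to IP.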

How does the web work?#

From the users' perspective, the web is a collection of web pages that contain links to hundreds of other objects stored across the Internet on different machines. A computing machine can be a client, a server, or both. Clients run web browser software, while servers run special software known as web server software, which they use to respond to requests from clients.

Web server software runs on clusters of commodity machines or a few high-end machines, giving each user the illusion that the servers are serving only them. The web servers rely on other back-end services, such as database services, blob storage, and other application servers, as shown in the figure below.

[Figure: A bird’s-eye view of the working of the World Wide Web. Clients send HTTP requests to the web server and receive HTTP responses; the diagram also shows application servers, a database, a data server, a blob store, ads and trackers, and an API.]

An Internet Protocol (IP) address uniquely identifies a computing device on the Internet or a local network. When a client with an IP address sends a request to a server, the server returns the requested data to that same client, with the connection identified by a 5-tuple (source IP, source port, protocol number, destination IP, and destination port). Upon receiving the data, the browser displays it as a web page within the browser's window, including images, videos, and other content, such as scripts.
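The 5-tuple can be sketched as a small record type. In the example below, the client address `192.0.2.10` (a documentation-only address) and the ephemeral port `51514` are made-up values, and the `FiveTuple` class with its `reply` method is a hypothetical helper. The server address is the one quoted earlier in this lesson, and 443 is the standard HTTPS port.

```python
from typing import NamedTuple

TCP = 6  # IANA protocol number for TCP (UDP is 17)


class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    protocol: int
    dst_ip: str
    dst_port: int

    def reply(self) -> "FiveTuple":
        """The same flow seen from the other end: source and destination swap."""
        return FiveTuple(self.dst_ip, self.dst_port, self.protocol,
                         self.src_ip, self.src_port)


# Client -> server: the request travels with this 5-tuple.
request_flow = FiveTuple("192.0.2.10", 51514, TCP, "104.18.3.119", 443)

# Server -> client: the response uses the mirrored 5-tuple.
response_flow = request_flow.reply()

print(request_flow)
print(response_flow)
```

Because the source port is part of the tuple, two browser tabs on the same machine talking to the same server still form two distinct connections.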

The client and the server repeatedly interact with each other for each web page the client needs. Both the client and server follow a series of steps to fulfill a request, as discussed in the following sections.

What happens on the client side?#

A client accesses different resources on the World Wide Web. Let’s first understand the structure of a web resource before describing the steps needed to get that resource.

Resources over the Internet are identified using a Uniform Resource Identifier (URI). The URI consists of three parts, as shown below.

https://www.educative.io/courses/grokking-modern-system-design-software-engineers-managers/g7EwEyNnR6D

[Figure: A URI of a web page for the domain www.educative.io, labeled with its protocol, domain name, and unique path to a specific page]

  • Protocol: The HTTPS protocol is used to fetch the web page.

  • Domain name: The DNS name shows where the web page is located.

  • Unique path: The unique path determines the web page that needs to be accessed within a domain.
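The three parts listed above can be pulled out of the example URI with Python's standard `urllib.parse` module. This is only a sketch of the decomposition: the mapping of `scheme`, `netloc`, and `path` onto the lesson's "protocol", "domain name", and "unique path" is our reading of the figure, and a full URI can also carry a port, query string, and fragment, which this example does not have.

```python
from urllib.parse import urlsplit

URI = ("https://www.educative.io/courses/"
       "grokking-modern-system-design-software-engineers-managers/g7EwEyNnR6D")

parts = urlsplit(URI)

print("Protocol:   ", parts.scheme)   # how to fetch the resource
print("Domain name:", parts.netloc)   # where the resource lives
print("Unique path:", parts.path)     # which resource within that domain
```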

The browser performs the following sequence of steps while fetching a web page or a resource:

  1. When a user clicks on a link, the browser determines the web page URL, for example, www.educative.io.

  2. In the next step, the browser asks the DNS server for the IP address of www.educative.io.

  3. The DNS server provides the web server's IP address where the web page resides.

  4. The browser makes a TCP connection with the server at the IP address provided by the DNS server.

  5. The browser asks for the web page using HTTPS, that is, HTTP sent over a TLS-secured connection.

  6. The server sends the page in response to the HTTPS request. If the fetched web page contains URLs of other objects that are needed for display, the browser fetches all of them using the same process.

  7. The browser displays the web page to the user.

  8. After the browser has fetched all the required web pages and the relevant objects, the TCP connections are terminated.
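The network-facing steps above (2 through 6, and 8) can be sketched with Python's standard `socket` and `ssl` modules. This is a simplified illustration, not how a real browser is implemented: the function names `build_request`, `exchange`, and `fetch` are hypothetical, the 10-second timeout is arbitrary, only the first address returned by the resolver is tried, and the response is read until the server closes the connection instead of honoring `Content-Length` or chunked encoding. Step 1 (choosing the URL) and step 7 (rendering) are outside its scope.

```python
import socket
import ssl


def build_request(host: str, path: str) -> bytes:
    """Step 5: a minimal HTTP/1.1 GET. 'Connection: close' asks the server to close when done."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def exchange(sock: socket.socket, host: str, path: str) -> bytes:
    """Steps 5-6: send the request, then read until the server closes the stream."""
    sock.sendall(build_request(host, path))
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # an empty read means the peer closed its side
            break
        chunks.append(chunk)
    return b"".join(chunks)


def fetch(host: str, path: str = "/", port: int = 443) -> bytes:
    # Steps 2-3: ask the resolver (DNS) for the server's address.
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    # Step 4: open a TCP connection to that address.
    tcp = socket.socket(family, socktype, proto)
    tcp.settimeout(10)
    try:
        tcp.connect(sockaddr)
        # HTTPS is HTTP over TLS, so secure the TCP connection before sending anything.
        context = ssl.create_default_context()
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            return exchange(tls, host, path)
    finally:
        # Step 8: the connection is closed once the response has been read.
        tcp.close()


# Example call (needs network access, so it is left commented out):
# raw = fetch("www.educative.io")
# print(raw.split(b"\r\n", 1)[0].decode("ascii", "replace"))
```

Keeping `exchange` separate from `fetch` means the request and response handling can be exercised over any connected stream socket, without touching DNS or TLS.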


What happens on the server side?#

Servers listen continuously for client requests. When a server receives a request from the client containing the path to a web page, the server retrieves the web page from the database or cache and then returns it to the client. The sequence of steps a server performs is given below:

  1. The server accepts a TCP connection from the client.

  2. The server receives a request for a resource identified by a unique URI.

  3. The server retrieves the web page's content from the database or cache.

  4. The web page's content is sent back to the client as an HTTP response.

  5. The TCP connection is terminated.
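The five server-side steps can be sketched with Python's standard `socket` module. Everything specific here is an assumption for illustration: the `PAGES` dictionary stands in for the database or cache, the port `8080` and the function names are made up, the request is read with a single `recv` call, and connections are handled one at a time over plain HTTP. Production web servers handle partial reads, concurrency, TLS, and much more.

```python
import socket

# Stand-in for the database or cache: path -> page body (made-up content).
PAGES = {
    "/": b"<h1>Home</h1>",
    "/about": b"<h1>About</h1>",
}


def build_response(status: str, body: bytes) -> bytes:
    """Wrap a body in a minimal HTTP/1.1 response."""
    head = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def handle_client(conn: socket.socket) -> None:
    """Steps 2-5 for one already-accepted connection."""
    with conn:  # Step 5: the connection is closed when this block exits.
        # Step 2: read the request and take the path from the request line,
        # which looks like "GET /about HTTP/1.1".
        request = conn.recv(4096).decode("iso-8859-1")
        parts = request.split("\r\n", 1)[0].split(" ")
        path = parts[1] if len(parts) == 3 else None
        # Step 3: look the page up in the "database or cache".
        body = PAGES.get(path)
        # Step 4: send the content back as an HTTP response.
        if body is None:
            conn.sendall(build_response("404 Not Found", b"<h1>Not found</h1>"))
        else:
            conn.sendall(build_response("200 OK", body))


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Step 1: listen and accept TCP connections, one at a time."""
    with socket.create_server((host, port)) as listener:
        while True:
            conn, _addr = listener.accept()
            handle_client(conn)
```

Calling `serve()` and opening `http://127.0.0.1:8080/about` in a browser would exercise all five steps, assuming that port is free on the local machine.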

The web has been continuously evolving since Tim Berners-Lee first introduced it in 1989. The first version, Web 1.0, primarily delivered static content and focused on providing static web pages using HTML markup and HTML forms.

In 1999, Darcy DiNucci coined the term Web 2.0, which was later popularized by Tim O’Reilly and Dale Dougherty at the first Web 2.0 Conference in 2004. In Web 2.0, the transmission of static content was replaced with a model emphasizing interactive involvement, searching and tagging, AJAX and partial page refreshes, blogging and RSS feeds, social networking, online bookmarking, content sharing, and so on. Web 2.0 made the Internet a practical application platform. Further, technologies based on Web 2.0 have made it possible for user interface (UI) components (such as widgets, dynamic JavaScript, movies, and HTML snippets) to be shared and embedded in HTML content.

Web 3.0 is envisioned as a third-generation, open and fair network where people can participate with fewer concerns about privacy and security. Its proponents expect it to integrate the strengths of Artificial Intelligence (AI) and Big Data, making the web more intelligent and paving the way for the semantic web. In this vision, blockchain enables the web to become decentralized and becomes the driving force behind the web’s data management and collection procedures.

Summary#

In this lesson, we described the World Wide Web (or the web) and how it works. We also discussed how this project was initiated at CERN in 1989 and how it evolved over the years into modern technologies. Next, we described the four essential components of the web: the client-server model, resource identifiers, hypertext and hypermedia, and markup language. We also discussed the steps involved when a client requests a resource from the server over the Internet and the protocols that cooperate in the process. Finally, we explored the evolution of the web from Web 1.0 to Web 3.0.

Quiz

Question

What’s the difference between static and dynamic web pages?

Static web pages are associated with Web 1.0, where content stored on the server is delivered to the client as is. Dynamic web pages, on the other hand, became common with Web 2.0, where end users interact with the page and its content can change automatically. Dynamic content is generated by server-side scripts or applications (such as PHP, JSP, and so on), which run on the server, and the result is given to the user. Parts of a dynamic page can also be rendered or updated on the client side.
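The difference can be sketched in a few lines. In this illustrative example, the page contents, the `serve_static` and `serve_dynamic` function names, and the idea of greeting a user by name are all assumptions. The point is only that the static page is returned unchanged, while the dynamic page is generated from a template at request time.

```python
from datetime import datetime, timezone
from html import escape
from string import Template

# Static: the stored content is returned to every client as is.
STATIC_PAGE = "<h1>Welcome</h1>"

# Dynamic: a server-side template is filled in for each request.
DYNAMIC_TEMPLATE = Template("<h1>Welcome, $user</h1><p>Generated at $now</p>")


def serve_static() -> str:
    return STATIC_PAGE


def serve_dynamic(user: str, now: datetime) -> str:
    # Escape user-supplied text so it cannot inject markup into the page.
    return DYNAMIC_TEMPLATE.substitute(user=escape(user), now=now.isoformat())


print(serve_static())
print(serve_dynamic("Ada", datetime.now(timezone.utc)))
```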
